A Machine Learning Architecture for Optimizing Web Search Engines

Authors

Justin Boyan

Dayne Freitag

Thorsten Joachims

Abstract

Indexing systems for the World Wide Web, such as Lycos and Alta Vista, play an essential role in making the Web useful and usable. These systems are based on Information Retrieval methods for indexing plain text documents, but also include heuristics for adjusting their document rankings based on the special HTML structure of Web documents. In this paper, we describe a wide range of such heuristics--including a novel one inspired by reinforcement learning techniques for propagating rewards through a graph--which can be used to affect a search engine’s rankings. We then demonstrate a system which learns to combine these heuristics automatically, based on feedback collected unintrusively from users, resulting in much improved rankings.
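The abstract mentions a heuristic inspired by reinforcement learning that propagates rewards through a graph. The abstract does not give the update rule, so the following is only a minimal sketch of the general idea, assuming a value-iteration-style backup over the hyperlink graph. The function name `propagate_relevance`, the `discount` and `sweeps` parameters, and the toy graph are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def propagate_relevance(
    links: Dict[str, List[str]],
    base_score: Dict[str, float],
    discount: float = 0.5,
    sweeps: int = 10,
) -> Dict[str, float]:
    """Propagate discounted relevance backwards along hyperlinks.

    Assumed rule (not from the paper): a page's score is its own base
    score plus `discount` times the best score among the pages it links to.
    """
    score = dict(base_score)
    for _ in range(sweeps):
        updated = {}
        for page, own in base_score.items():
            # Scores of linked pages that are known to the index.
            successors = [score[t] for t in links.get(page, []) if t in score]
            updated[page] = own + discount * max(successors, default=0.0)
        score = updated
    return score


# Toy graph: a -> b -> c, where only c matches the query directly.
links = {"a": ["b"], "b": ["c"], "c": []}
base = {"a": 0.0, "b": 0.0, "c": 1.0}
scores = propagate_relevance(links, base)
```

In this toy graph, pages that lead toward the matching page `c` pick up a share of its score that shrinks with link distance, which is the intuition behind using such a score as one ranking heuristic among several.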

Similar resources

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing search engine results. Ranking takes place according to one or more parameters, such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. The ranking is one of the most important components of the search engine that represents the degree of the vitality...

A Novel Architecture for Detecting Phishing Webpages using Cost-based Feature Selection

Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features from a webpage to determine whether or not it is a phishing webpage. Selecting appropriate features improves the performance of a PWDS. The performance criteria are detection accuracy and system response time. The major time consumed by PWDS arises from feature extraction that ...

A Machine Learning Approach to Building Domain-Specific Search Engines

Domain-specific search engines are becoming increasingly popular because they offer increased accuracy and extra features not possible with general, Web-wide search engines. Unfortunately, they are also difficult and time-consuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific search engines. We describ...

Yarrow: A Real-Time Client Side Meta-Search Learner

In this paper we report our research on building Yarrow, an intelligent web meta-search engine. The predominant feature of Yarrow is that, in contrast to the lack of adaptive learning features in existing meta-search engines, Yarrow is equipped with a practically efficient on-line learning algorithm so that it is capable of helping the user to search for the desired documents with as little feedba...

Publication date: 1996
